---
title: "Replication Material 'Home turf: International organizations and earmarked funding'"
author: "Timon Forster"
date: "September 2025"
output: html_notebook
---

# Introduction and set-up

This R Notebook provides the code to replicate all analyses in 'Home turf: International organizations and earmarked funding', published with *the Review of International Organizations*.

Before merging different datasets, inspecting descriptive statistics, and running the regression analyses, load the relevant packages and set the directory to datasets.


```{r}
# Load packages
library(dplyr)
library(magrittr)
library(tidyr)
library(ggplot2)
library(gghighlight)
library(scales)
library(rnaturalearth)
library(quanteda) 
library(stringr)
library(stargazer)
library(patchwork)
library(cowplot)
library(modelsummary)
library(clubSandwich)
library(fixest)
library(marginaleffects)

# Set directory
dir_local <- "C:/Users/timon/Dropbox/Research/HQ_EFIO/Submission/RIO/Replication/"
```

```{r}
# Load fonts for figures
library(hrbrthemes)
import_roboto_condensed()
```

# Data

## Earmarked funding (Components 1, 2 + 4)

First, the subject of my analysis are only parent IOs with a permanent secretariat that have implementing capacity. Based on the Earmarked Funding Dataset (Reinsberg et al. 2024, https://doi.org/10.1007/s11558-024-09548-1) 'Component 1', I create the following list of 255 IOs, including the manually coded HQs (Component 1, filtering `parentID` == `childID` and `opio` == 1).

```{r}
# Load sample, based on Component 1 (List of organizations and their earmarked funding facilities)
dt_sample <- readRDS(paste0(dir_local, "/EFIO_Sample_Sept2025.rds"))
```


Now, let's examine Component 2, an activity-level database of earmarked activities by donors with operational IOs.

Here, I also leverage one of the key advantages of the Earmarked Funding Dataset---that it allows for capturing the stringency of earmarking at two levels. Earmarking can occur at the parent level, for example when a donor pays into a trust fund that supports certain parts of the overall IO mandate. This is the focus of my research. (Earmarking can also occur at the child level, for example when donor earmarks within a trust fund so that its contribution cannot support the full range of programmatic activities of the trust fund.)

In each of the following three dimensions, activities can be strictly earmarked, softly earmarked, or not earmarked.

* *Thematic earmarking* indicates the (sub-)sector focus of an activity. No earmarking exists when the contribution can be used to support any activity within the IO mandate (`thm` = 0 and `prj` = 0 in the dataset). An activity is softly earmarked if it supports a broadly defined theme that is a subset of the overall mandate of the IO (`thm` = 1 and `prj` = 0). An activity is strictly earmarked if it supports a narrowly defined theme, typically in the form of a distinct project (`thm` = 0 and `prj` = 1). Note that `thm` = 1 and `prj` = 1 cannot hold at the same time (yes, verified this, see `summary(efio_comp2$thm + efio_comp2$prj)`).

* *Geographic earmarking* limits the geographic scope of an activity within the IO mandate. For example, for a global IO, an activity in a specific world region is softly earmarked (`reg` = 1). A country-specific activity is strictly earmarked (`cty` = 1). No earmarking exists when the geographical scope of a contribution exactly matches the IO mandate (`reg` = 0 and `cty` = 0). Note that `reg` = 1 and `cty` = 1 cannot hold at the same time (yes, verified this, see `summary(efio_comp2$reg + efio_comp2$cty)`).

* *Institutional earmarking* refers to donor restrictions that restrict IO autonomy in other ways. Soft forms of earmarking include that funds are directed to specific institutional units (`inst` = 1), or specific institutional actors within the organization (`staffco` = 1). Strictly earmarked activities involve bilateral secondment of donor staff into the IO administration (`staffbi` = 1). Different types of institutional earmarking are not mutually exclusive, for instance if staff is seconded to a specific unit rather than the overall IO.

```{r}
# Component 2: Earmarked aid activities in the multilateral system
dt_EFIO_Component2 <- readRDS(paste0(dir_local, "/EFIO_Component2.rds")) %>% 
  # keep only parent IOs (only operational are included by default)
  filter(parentID == childID) %>% 
  # keep only orgs that are in sample
  filter(parentID %in% unique(dt_sample$parentID)) %>% 
  # compute stringency
  mutate(str_tot = (1 * thm + 2 * prj) +
           (1 * reg + 2 * cty) +
           (pmax(inst, staffco, 2 * staffbi, na.rm = TRUE))) %>% 
  # keep only relevant variables
  select(i, year, donorcode, iso3, agencycode, projectnumber, recipientcode, recipientname, rcode,
         purposecode, sectorcode, usd_commitment_defl, parentID, childID,
         thm, prj, reg, cty, inst, staffbi, staffco, em, unem, str_tot)
```


## Diplomatic missions

In addition, I include a control variable for diplomatic missions --- the data and the country codes.

```{r}
dt_DIPL <- readRDS(paste0(dir_local, "/Diplomatic-Representation_Sept2025.rds"))
```


## Define DAC

The sample concerns only the earmarked funding by DAC donors; hence I load a file with a dummy variable `donor_DAC`.

```{r}
dt_EFIO_cname <- read.csv(paste0(dir_local, "/EFIO_cname.csv"),
                          fileEncoding = "UTF-8-BOM", header = TRUE)
```


## Merging

Now, I can merge these datasets together and define additional variables, including the log of disbursements, the salience of an IO for a donor-country, and FEs.


```{r}
dt_HQEFIO <- dt_sample %>% 
  # add flows and stringency (component 2)
  left_join(dt_EFIO_Component2, by = c("parentID")) %>%
  # add DIPL data
  left_join(dt_DIPL, by = c("rcode" = "dest_iso3",
                                  "iso3" = "send_iso3",
                                  "year")) %>%
  # attach donor country names and subset to DAC donors
  left_join(dt_EFIO_cname, by = c("iso3")) %>% 
  filter(donor_DAC == 1) %>%
  # keep only parents
  filter(parentID == childID) %>% 
  # remove if no HQ
  filter(!is.na(hq_city)) %>% 
  # remove obs before establishment/after end, likely coding errors
  filter(year >= yestab_w) %>% 
  mutate(yeend_w = replace_na(yeend_w, 2999)) %>% 
  filter(year <= yeend_w) %>%
  # calculate years since established (age), and center variable
  mutate(yr_sestab = year - yestab_w) %>% 
  mutate(yr_sestab_cent = yr_sestab - mean(yr_sestab, na.rm = TRUE)) %>% 
  # generate dimension specific stringency
  mutate(str_thematic = (1 * thm + 2 * prj),
         str_geographic = (1 * reg + 2 * cty),
         str_institutional = (pmax(inst, staffco, 2 * staffbi, na.rm = TRUE))) %>% 
  # define treatment variable
  mutate(hq_aligned = if_else(hq_country == donor_cname, 1, 0)) %>% 
  # measure salience (number of projects from donor to org in a given year)
  group_by(iso3, parentID, year) %>%
  mutate(n_projects = n(), .groups = "keep") %>% 
  ungroup() %>%
  # generate COW dummy
  mutate(is_COW = if_else(!is.na(io_num), 1, 0)) %>% 
  # define logs (commitments and IO salience)
  mutate(ln_comm = sign(usd_commitment_defl) * 
           log(abs(usd_commitment_defl + 1)),
         ln_projects = log(n_projects)) %>% 
  # calculate FEs
  mutate(donor_FE = as.factor(iso3),
         sector_FE = as.factor(sectorcode),
         year_FE = as.factor(year),
         parent_FE = as.factor(parentID),
         recipient_FE = as.factor(recipientcode),
         donor_recipient_FE = paste0(iso3, recipientcode),
         donor_IO_FE = paste0(iso3, parentID))
```


Let's save this file.

```{r}
saveRDS(dt_HQEFIO, paste0(dir_local, "/dt_HQEFIO_Sept2025.rds"))
```




# Descriptive statistics

First, let's load the data (see above for the merging).

```{r}
dt_HQEFIO <- readRDS(paste0(dir_local, "/dt_HQEFIO_Sept2025.rds"))
```


How many donors (DAC donors), IOs (as defined above), years, recipients (not just countries) are in the dataset?

```{r}
n_distinct(dt_HQEFIO$iso3)
n_distinct(dt_HQEFIO$parentID)
n_distinct(dt_HQEFIO$year)
n_distinct(dt_HQEFIO$recipientname)
n_distinct(dt_HQEFIO$sectorcode)
```

```{r}
summary(dt_HQEFIO$usd_commitment_defl)
summary(dt_HQEFIO$str_tot)
```




## Location of IOs

```{r}
# Load data with number of IOs and lon/lat of HQ
dt_HQmap <- readRDS(paste0(dir_local, "/dt_HQmap_Sept2025.rds"))

# Get world map data
world_map <- map_data("world") %>% 
  filter(region != "Antarctica") %>% 
  # attach donor_DAC
  mutate(donor_DAC = if_else(region %in% c("Australia", "Austria", "Belgium", "Canada", "Czech Republic", "Denmark", "Estonia", "Finland", "France", "Germany", "Greece", "Hungary", "Iceland", "Ireland", "Italy", "Japan", "South Korea", "Latvia", "Lithuania", "Luxembourg", "Netherlands", "New Zealand", "Norway", "Poland", "Portugal", "Slovakia", "Slovenia", "Spain", "Sweden", "Switzerland", "UK", "USA"),
                             "DAC donor", "Other country")) 
```


Now let's visualize this.

```{r}
ggplot() +
  geom_polygon(data = world_map, 
               aes(x = long, y = lat, group = group, fill = donor_DAC), 
               color = "black", linewidth = 0.015) + #  fill = "gray90"
  geom_point(data = dt_HQmap, 
             aes(x = lon, y = lat, size = n_IGOs),
             color = "#3B528BFF", alpha = 0.5) +
  scale_size(range = c(1, 5)) +  # Adjust circle sizes
  theme_bw() +
  hrbrthemes::theme_ipsum_rc() +
  labs(x = element_blank(), # "Longitude"
       y = element_blank(), # "Latitude"
       size = "Number of IGOs",
       fill = NULL) + 
  theme(legend.position = "bottom",
        strip.text = element_text(size = 17),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank()) + 
  scale_fill_manual(values = c("#C4A484", "gray90")) + # #1F9E89FF 83F28F 008000 
  guides(
    fill = guide_legend(order = 1),
    size = guide_legend(order = 2)
  )
# ggsave(filename = paste0(dir_local, "/HQ_map_Sept2025.png"),
#        device = "png", height = 5, width = 8.4, unit = "in", bg = "white")
```


Some descriptives.

```{r}
dt_sample %>% 
  # select relevant variables
  select(parentID, hq_city, hq_country) %>% 
  unique() %>% 
  # group_by
  group_by(hq_country) %>% 
  summarize(n_IOs = n(), .groups = "keep") %>% 
  ungroup() %>% 
  arrange(desc(n_IOs))
```


Number of cities and countries of IO HQs:

```{r}
n_distinct(dt_sample$hq_city)
n_distinct(dt_sample$hq_country)
```


## Mean stringency

```{r}
summary(dt_HQEFIO$hq_aligned)
summary(dt_HQEFIO$str_tot)
```


```{r}
dt_HQEFIO %>%
  # Calculate group mean
  group_by(hq_aligned) %>%
  summarize(str_mean = mean(str_tot, na.rm = TRUE), .groups = "keep") %>%
  ungroup() %>%
  mutate(hq_aligned_fact = factor(hq_aligned, levels = c(1, 0),
                                 labels = c("Yes", "No"), ordered = TRUE)) %>% 
  # Visualize
  ggplot(aes(x = hq_aligned_fact, y = str_mean, fill = hq_aligned_fact)) +
  geom_bar(stat = "summary") +
  geom_text(aes(label = round(str_mean, 2)), vjust = 5, size = 4.2) +
  theme_bw() +
  hrbrthemes::theme_ipsum_rc() +
  theme(legend.position = "None",
        axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15),
        panel.grid.minor.x = element_blank(),
        axis.text.x = element_text(size = 14),
        axis.text.y = element_text(size = 14)) +
  labs(x = "\nHQ located in donor-country",
       y = "Stringency of earmarked funding\n") +
  scale_y_continuous(limits = c(0, 3)) +
  scale_fill_manual(values = c("#FFC000", "#A24857"))
# ggsave(filename = paste0(dir_local, "/STR_HQ_Sept2025.png"),
#        dpi = 300, device = "png", height = 5, width = 8, bg = "white")
```


```{r}
dt_HQEFIO %>%
  group_by(hq_aligned) %>%
  summarize(str_mean = mean(str_tot, na.rm = TRUE)) %>%
  ungroup()
```


What are these values for the United States, Switzerland, and France:

```{r}
dt_HQEFIO %>%
  filter(iso3 %in% c("CHE", "USA", "FRA")) %>% 
  group_by(iso3, hq_aligned) %>%
  summarize(str_mean = mean(str_tot, na.rm = TRUE), .groups = "keep") %>%
  ungroup()
```


### US case

"For IOs headquartered in the United States, the mean level of stringency is 2.70 (SD = 0.86), whereas for IOs headquartered abroad it rises to 2.87 (SD = 0.88)."

```{r}
# US + IOs with HQ in US
dt_US_HQ <- dt_HQEFIO %>% 
  filter(iso3 == "USA") %>%
  filter(hq_aligned == 1)
# US + IOs with HQ outside US
dt_US_nHQ <- dt_HQEFIO %>% 
  filter(iso3 == "USA") %>%
  filter(hq_aligned == 0)
```


```{r}
mean(dt_US_HQ$str_tot)
sd(dt_US_HQ$str_tot)
```

```{r}
mean(dt_US_nHQ$str_tot)
sd(dt_US_nHQ$str_tot)
```  

"US earmarking to DC-based organizations is less stringent than to New York-based IOs (2.36 vs. 2.80), especially with regard to geographic conditions."

```{r}
# US Washington, DC
dt_US_HQ %>% 
  filter(hq_city == "Washington, DC") %>%
  summarize(str_thematic = mean(str_thematic, na.rm = TRUE),
            str_geographic = mean(str_geographic, na.rm = TRUE),
            str_institutional = mean(str_institutional, na.rm = TRUE),
            str_tot = mean(str_tot, na.rm = TRUE)) %>% 
  ungroup()
```

```{r}
# US, NYC
dt_US_HQ %>% 
  filter(hq_city == "New York, NY") %>%
  summarize(str_thematic = mean(str_thematic, na.rm = TRUE),
            str_geographic = mean(str_geographic, na.rm = TRUE),
            str_institutional = mean(str_institutional, na.rm = TRUE),
            str_tot = mean(str_tot, na.rm = TRUE)) %>% 
  ungroup()
```


"But US delegation of authority to organizations in both of these cities, on average, is still more lenient than to institutions in Rome (3.11), including the World Food Programme and the Food and Agricultural Organization (FAO)."


```{r}
# US, Rome
dt_US_nHQ %>% 
  filter(hq_city == "Rome") %>%
  summarize(str_thematic = mean(str_thematic, na.rm = TRUE),
            str_geographic = mean(str_geographic, na.rm = TRUE),
            str_institutional = mean(str_institutional, na.rm = TRUE),
            str_tot = mean(str_tot, na.rm = TRUE)) %>% 
  ungroup()
```


### Additional statistics

"In total, 147 organizations were created prior to 1990, whereas 108 IOs have been established since."

```{r}
dt_HQEFIO %>% 
  filter(yestab_w < 1990) %>% 
  select(acronym, name, yestab_w) %>% 
  unique()
```

```{r}
dt_HQEFIO %>% 
  filter(yestab_w >= 1990) %>% 
  select(acronym, name, yestab_w) %>% 
  unique()
```



# Regression analysis

First, let's load the data (see above for the merging).

```{r}
dt_HQEFIO <- readRDS(paste0(dir_local, "/dt_HQEFIO_Sept2025.rds"))
```


## Baseline

Table 1: HQ alignment with donor-country and stringency

```{r}
# No FEs
m1_noFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# Year FEs
m2_yFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# Donor + Year FEs
m3_dyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | donor_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# Donor + Recipient + Year FEs
m4_dryFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | donor_FE + recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# Donor-Recipient + Year FEs
m5_dyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```

```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**",
  lncomm = "Commitments (log USD)",
  ln_projects = "Number of projects (log)"
)

# Custom model names
model_names <- c(
  "No FEs",
  "Year FEs",
  "Donor + Year FEs",
  "Donor + Recipient + Year FEs",
  "Donor-Recipient + Year FEs"
)

# Create the table
modelsummary(
  list(m1_noFE, m2_yFE, m3_dyFE, m4_dryFE, m5_dyyFE),
  coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "T1_baseline_20250923.tex"
)
```



```{r}
summary(dt_HQEFIO$str_tot)
sd(dt_HQEFIO$str_tot, na.rm = TRUE)
```


## Robustness checks

### A1: Sector fixed effects

```{r}
m1_sFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | sector_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m2_syFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | sector_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m3_sdyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | sector_FE + donor_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m4_sdryFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | sector_FE + donor_FE + recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m5_sdyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | sector_FE + donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```

```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**",
  lncomm = "Commitments (log USD)",
  ln_projects = "Number of projects (log)"
)

# Custom model names
model_names <- c(
  "No FEs",
  "Year FEs",
  "Donor + Year FEs",
  "Donor + Recipient + Year FEs",
  "Donor-Recipient + Year FEs"
)

# Create the table
modelsummary(
  list(m1_sFE, m2_syFE, m3_sdyFE, m4_sdryFE, m5_sdyyFE),
  coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "A1_sectorFE_20250923.tex"
)
```


### A2: Stringency (log)

```{r}
m1_noFE <- feols(log(str_tot + 1) ~ hq_aligned + ln_comm + ln_projects,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m2_yFE <- feols(log(str_tot + 1)  ~ hq_aligned + ln_comm + ln_projects | year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m3_dyFE <- feols(log(str_tot + 1)  ~ hq_aligned + ln_comm + ln_projects | donor_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m4_dryFE <- feols(log(str_tot + 1)  ~ hq_aligned + ln_comm + ln_projects | donor_FE + recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m5_dyyFE <- feols(log(str_tot + 1)  ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```


```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**",
  lncomm = "Commitments (log USD)",
  ln_projects = "Number of projects (log)"  
)

# Custom model names
model_names <- c(
  "No FEs",
  "Year FEs",
  "Donor + Year FEs",
  "Donor + Recipient + Year FEs",
  "Donor-Recipient + Year FEs"
)

# Create the table
modelsummary(
  list(m1_noFE, m2_yFE, m3_dyFE, m4_dryFE, m5_dyyFE),
  coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding (log)",
  model_names = model_names,
  output = "A2_log_20250923.tex"
)
```


### A3: No controls

```{r}
# No Controls
m1_noFE <- feols(str_tot ~ hq_aligned,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m2_yFE <- feols(str_tot ~ hq_aligned | year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m3_dyFE <- feols(str_tot ~ hq_aligned | donor_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m4_dryFE <- feols(str_tot ~ hq_aligned | donor_FE + recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m5_dyyFE <- feols(str_tot ~ hq_aligned | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```


```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**"
)

# Custom model names
model_names <- c(
  "No FEs",
  "Year FEs",
  "Donor + Year FEs",
  "Donor + Recipient + Year FEs",
  "Donor-Recipient + Year FEs"
)

# Create the table
modelsummary(
  list(m1_noFE, m2_yFE, m3_dyFE, m4_dryFE, m5_dyyFE),
  coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "A3_excontrols_20250923.tex"
)
```



### A4: Additional controls

```{r}
# New control: Purpose of IO
ma_dyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects + allpurpose | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# New control: Global IO
mc_dyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects + global | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# New control: COW organization
md_dyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects + is_COW | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# New control: Diplomatic representation
me_dyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects + lor_DIPL | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```


```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**",
  lncomm = "Commitments (log USD)",
  ln_projects = "Number of projects",
  allpurpose = "All purpose IO",
  global = "Global IO",
  is_COW = "COW organization",
  lor_DIPL = "Diplomatic representation"
)

# Custom model names
model_names <- c(
  "Year FEs",
  "Donor + Year FEs",
  "Donor + Recipient + Year FEs",
  "Donor-Recipient + Year FEs"
)

# Create the table
modelsummary(
  list(ma_dyyFE, mc_dyyFE, md_dyyFE, me_dyyFE),
  coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "A4_controls_20250923.tex"
)
```



### Leave-one-out (Figure 3)

Let's see whether these results are driven by an individual country.

```{r}
# Create an empty dataframe to store results, pre-allocate results
unique_donors <- sort(unique(dt_HQEFIO$iso3))
results <- vector("list", length(unique_donors))

for (i in seq_along(unique_donors)) {
  donor_country <- unique_donors[i]
  dt_sub <- dt_HQEFIO[dt_HQEFIO$iso3 != donor_country, ]

  model <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
                 data = dt_sub, cluster = ~ donor_IO_FE)

  results[[i]] <- data.frame(
    Donor_Excluded = donor_country,
    HQ_Coefficient = coef(model)["hq_aligned"],
    Standard_Error = se(model)["hq_aligned"]
  )
}

# Store results
dt_results_loo <- do.call(rbind, results) %>% 
  # add 95% ci
  mutate(CI_Lower = HQ_Coefficient - 1.96 * Standard_Error,
         CI_Upper = HQ_Coefficient + 1.96 * Standard_Error)
```


Let's also re-estimate the baseline model.

```{r}
# Donor-Recipient + Year FEs
m5_dyyFE <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
           data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```

Visualization with density.

```{r}
# baseline
# -0.100903 new
```

```{r}
# Print or save the results dataframe
dt_results_loo %>% 
  ggplot(aes(x = HQ_Coefficient)) +
  geom_density(fill = "gray") +
  geom_vline(xintercept = -0.100903, color = "black", linewidth = 1, linetype = "dashed") +
  annotate("text", x = -0.097, y = 250, label = "Baseline\n-0.101", color = "black") +
  annotate("text", x = -0.086, y = 50, 
           label = "excl. US\n -0.086", color = "black") +
  annotate("text", x = -0.120, y = 50, 
           label = "excl. Switzerland\n -0.123", color = "black") +
  hrbrthemes::theme_ipsum_rc() +
  theme(legend.position = "None",
        axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15),
        axis.text.x = element_text(size = 14),
        axis.text.y = element_text(size = 14),
        panel.grid.minor.x = element_blank(),
        panel.grid.minor.y = element_blank()) +
  labs(x = "Coefficient: HQ located in donor-country",
       y = "Density")
# ggsave(filename = paste0(dir_local, "/LOO_Sept2025.png"),
#        dpi = 300, device = "png", height = 5, width = 8, bg = "white")
```



```{r}
dt_results_loo %>% 
  arrange(desc(HQ_Coefficient)) %>% 
  head()
```

```{r}
dt_results_loo %>% 
  arrange(desc(HQ_Coefficient)) %>% 
  tail()
```



## Additional analyses

### B1: Disaggregated by type of stringency

```{r}
# Disaggregating results by type of stringency: thematic, geographic, or institutional
m_them <- feols(str_thematic ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
                data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m_geo <- feols(str_geographic ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
               data = dt_HQEFIO, cluster = ~ donor_IO_FE)

m_inst <- feols(str_institutional ~ hq_aligned + ln_comm + ln_projects | donor_recipient_FE + year_FE,
                data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```


```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**",
  lncomm = "Commitments (log USD)",
  ln_projects = "Number of projects (log)"  
)

# Custom model names
model_names <- c(
  "Thematic",
  "Geographic",
  "Institutional"
)

# Create the table
modelsummary(
  list(m_them, m_geo, m_inst),
  coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "B1_disaggreg_20250923.tex"
)
```




### B2: Temporal dynamics

#### Figure 4: Marginal effect of HQ alignment on stringency, over IO age

```{r}
# 1. Model with interaction of HQ alignment and years since establishment
m_time_raw <- feols(
  str_tot ~ hq_aligned * yr_sestab + ln_comm + ln_projects | donor_recipient_FE + year_FE, 
  data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```

```{r}
# 2. Marginal effects of hq_aligned over IGO age (0 to 155)
slope_estimates_raw <- slopes(
  model = m_time_raw,
  variables = "hq_aligned",
  newdata = datagrid(model = m_time_raw, yr_sestab = seq(0, 155, by = 1))
)
```

```{r}
# 3. Create support (binned) data for IGO age (RHS of figure)
bins <- dt_HQEFIO %>%
  filter(yr_sestab >= 0 & yr_sestab <= 155) %>%
  mutate(bin = cut(yr_sestab, breaks = seq(0, 160, by = 5), right = FALSE)) %>%
  count(bin) %>%
  filter(!is.na(bin)) %>%
  mutate(
    bin_clean = gsub("\\[|\\)", "", bin),
    lower = as.numeric(sub(",.*", "", bin_clean)),
    upper = as.numeric(sub(".*,", "", bin_clean)),
    mid = (lower + upper) / 2
  )
```


Now, bring this all together and produce figure.

```{r}
# 1. Define bar limits
bar_min_y <- -0.75
bar_max_y <- -0.4

# 2. Rescale number of observations to y-range
bins$n_scaled <- scales::rescale(bins$n, to = c(bar_min_y, bar_max_y))

# 3. Plot
ggplot() +
  # A. Draw gray bars manually using geom_rect (from -0.75 upward)
  geom_rect(
    data = bins,
    aes(
      xmin = mid - 2.5,
      xmax = mid + 2.5,
      ymin = bar_min_y,
      ymax = n_scaled
    ),
    fill = "gray70"
  ) +

  # B. Marginal effect ribbon
  geom_ribbon(
    data = slope_estimates_raw,
    aes(x = yr_sestab, ymin = conf.low, ymax = conf.high),
    fill = "steelblue", alpha = 0.2
  ) +

  # C. Marginal effect line
  geom_line(
    data = slope_estimates_raw,
    aes(x = yr_sestab, y = estimate),
    linewidth = 1.2, color = "steelblue"
  ) +

  # D. Zero reference line
  geom_hline(yintercept = 0, color = "black") +

  # E. Axes
  scale_x_continuous(limits = c(0, 150)) +
  scale_y_continuous(
    name = "Marginal effect of HQ alignment\n",
    sec.axis = sec_axis(
      trans = ~ scales::rescale(., from = c(bar_min_y, bar_max_y), to = c(min(bins$n), max(bins$n))),
      name = "\nNo. Obs.",
      breaks = c(0, 10000, 20000),
      labels = comma
    )
  ) +

  # F. Theme
  theme_bw() +
  hrbrthemes::theme_ipsum_rc() +
  theme(
    legend.position = "none",
    axis.title.x = element_text(size = 15),
    axis.title.y.left = element_text(size = 15),
    axis.title.y.right = element_text(
      size = 15,
      hjust = 0.2
    ),
    axis.text = element_text(size = 14),
    panel.grid.minor.x = element_blank()
  ) +

  # G. Final label
  labs(x = "\nYears since international organization established")
# ggsave(filename = paste0(dir_local, "/TIME_Sept2025.png"),
#        dpi = 300, device = "png", height = 5, width = 8, bg = "white")
```



#### Table B2: HQ located in donor-country and stringency: temporal dynamics


```{r}
# Model with interaction of HQ alignment and years since establishment
m_time_raw <- feols(
  str_tot ~ hq_aligned * yr_sestab + ln_comm + ln_projects | donor_recipient_FE + year_FE, 
  data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# Poly age
m_time_polyage <- feols(
  str_tot ~ hq_aligned * yr_sestab + poly(yr_sestab, 2) + ln_comm + ln_projects | donor_recipient_FE + year_FE, 
  data = dt_HQEFIO, cluster = ~ donor_IO_FE)

# Donor specific time trends
m_time_trend <- feols(
  str_tot ~ hq_aligned * yr_sestab + ln_comm + ln_projects | donor_recipient_FE + year_FE + iso3[year], 
  data = dt_HQEFIO, cluster = ~ donor_IO_FE)
```


```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ located in donor-country**",
  lncomm = "Commitments (log USD)",
  ln_projects = "Number of projects (log)"
)

# Custom model names
model_names <- c(
  "Interaction",
  "Non-linear",
  "Time trends"
)

# Create the table
modelsummary(
  list(m_time_raw, m_time_polyage, m_time_trend),
  # coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the donor-IO dyad in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "B2_TIME_20250923.tex"
)
```






## Illustrative case study: Switzerland

### Data subset

```{r}
# Subset: only Swiss donor observations
dt_CHE <- subset(dt_HQEFIO, donor_FE == "CHE") %>% 
  # Define Swiss priority orgs
  mutate(CHE_priority = if_else(name %in% c(
    "World Bank Group",
    "Asian Development Bank",
    "African Development Bank",
    "Inter-American Development Bank, Inter-American Investment Corporation and Multilateral Investment Fund",
    "International Organization for Migration",
    "United Nations Development Programme",
    "United Nations Children's Fund",
    "United Nations Population Fund",
    "United Nations Entity for Gender Equality and the Empowerment of Women",
    "World Health Organisation - assessed contributions",
    "World Health Organisation - core voluntary contributions account",
    "World Food Programme",
    "Food and Agricultural Organisation",
    "United Nations High Commissioner for Refugees",
    "International Committee of the Red Cross",
    "International Federation of Red Cross and Red Crescent Societies",
    "United Nations Relief and Works Agency for Palestine Refugees in the Near East",
    "Global Environment Facility",
    "Global Fund to Fight AIDS, Tuberculosis and Malaria",
    "Gavi, the Vaccine Alliance", 
    "United Nations Industrial Development Organisation",
    "International Fund for Agricultural Development",
    "Organisation for Economic Co-operation and Development",
    "Office of the United Nations High Commissioner for Human Rights (extrabudgetary contributions only)"
  ), 1, 0))
```


### Regression

Table 2: HQ located in donor-country and stringency: Switzerland

```{r}
# Baseline model (recipient + year FE)
m_che <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | recipient_FE + year_FE,
               data = dt_CHE, cluster = ~ parent_FE)

# + Priority IOs
m_che2 <- feols(str_tot ~ hq_aligned + CHE_priority + ln_comm + ln_projects | recipient_FE + year_FE,
               data = dt_CHE, cluster = ~ parent_FE)

# + Priority IOs interaction
m_che3 <- feols(str_tot ~ hq_aligned * CHE_priority + ln_comm | recipient_FE + year_FE,
               data = dt_CHE, cluster = ~ parent_FE)

# Baseline + sector FEs
m_che4 <- feols(str_tot ~ hq_aligned + ln_comm + ln_projects | sector_FE + recipient_FE + year_FE,
               data = dt_CHE, cluster = ~ parent_FE)

# + Priority IOs + sector FEs
m_che5 <- feols(str_tot ~ hq_aligned + CHE_priority + ln_comm + ln_projects | sector_FE + recipient_FE + year_FE,
               data = dt_CHE, cluster = ~ parent_FE)

# + Priority IOs interaction + sector FEs
m_che6 <- feols(str_tot ~ hq_aligned * CHE_priority + ln_comm + ln_projects | sector_FE + recipient_FE + year_FE,
               data = dt_CHE, cluster = ~ parent_FE)
```


```{r}
# Custom variable labels
var_labels <- c(
  hq_aligned = "**HQ in Switzerland**",
  ln_comm = "Commitments (log USD)",
  ln_projects = "Number of projects (log)"
)

# Custom model names
model_names <- c(
  "Interaction",
  "Non-linear",
  "Time trends"
)

# Create the table
modelsummary(
  list(m_che, m_che2, m_che3, m_che4, m_che5, m_che6),
  # coef_map = var_labels,
  gof_map = c("r.squared", "r2.within", "nobs"),
  stars = c('*' = .1, '**' = .05, '***' = .01),
  statistic = "({std.error})",
  notes = "Constant suppressed. Standard errors clustered on the IO in parentheses. ∗p<0.10; ∗∗p<0.05; ∗∗∗p<0.01.",
  title = "Dependent variable: Stringency of earmarked funding",
  model_names = model_names,
  output = "T2_CHE_20250923.tex"
)
```





